---
title: Forecasting Eye State from Electroencephalogram Readings with MindsDB
sidebarTitle: Brain Activity
---

<Note>
This tutorial uses the Lightwood integration that requires the `mindsdb/mindsdb:lightwood` Docker image. [Learn more here](/setup/self-hosted/docker#install-mindsdb).
</Note>

## Introduction

In this tutorial, we'll create and train a machine learning model, or as we call
it, an `AI Table` or a `predictor`. By querying the model, we'll produce
categorical forecasts for a multivariate time series.

Make sure you have access to a working MindsDB installation, either locally or
at [MindsDB Cloud](https://cloud.mindsdb.com/).

If you want to learn how to set up your account at MindsDB Cloud, follow
[this guide](/setup/cloud/). Another way is to set up
MindsDB locally using
[Docker](/setup/self-hosted/docker/) or
[Python](/setup/self-hosted/pip/source/).

Let's get started.

## Data Setup

### Connecting the Data

There are a couple of ways you can get the data to follow through with this
tutorial.

### Connecting as a database

    You can connect to a demo database that we've prepared for you. It contains the data used throughout this tutorial (the `example_db.demo_data.eeg_eye` table).

    ```sql
    CREATE DATABASE example_db
        WITH ENGINE = "postgres",
        PARAMETERS = {
            "user": "demo_user",
            "password": "demo_password",
            "host": "3.220.66.106",
            "port": "5432",
            "database": "demo"
    };
    ```

    Now you can run queries directly on the demo database. Let's preview the data that we'll use to train our predictor.

    ```sql
    SELECT *
    FROM example_db.demo_data.eeg_eye
    LIMIT 10;
    ```

### Connecting as a file

    The dataset we use in this tutorial is the UCI's *EEG Eye State* dataset. You can download it [here](https://archive.ics.uci.edu/ml/datasets/EEG+Eye+State) in the `ARFF` format that should be converted to the `CSV` format before uploading it via [MindsDB SQL Editor](/connect/mindsdb_editor/).

    Follow [this guide](/sql/create/file/) to find out how to upload a file to MindsDB.

    Now you can run queries directly on the file as if it were a table. Let's preview the data that we'll use to train our predictor.

    ```sql
    SELECT *
    FROM files.eeg_eye
    LIMIT 10;
    ```

<Warning>
  Pay Attention to the Queries" From now on, we'll use the `files.eeg_eye` file
  as a table. Make sure you replace it with `example_db.demo_data.eeg_eye` if
  you connect the data as a database.
</Warning>

### Understanding the Data

We use the UCI's _EEG Eye State_ dataset, where each row contains data of one
electroencephalogram (EEG) reading plus the current state of the patient's eye,
where `0` indicates open eye and `1` indicates closed eye. We want to know ahead
of time when the eye state will change, so we predict the `eyeDetection` column.

Below is the sample data stored in the `files.eeg_eye` table.

```sql
+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+--------------+
| AF3      | F7       | F3       | FC5      | T7       | P7       | O1       | O2       | P8       | T8       | FC6      | F4       | F8       | AF4      | eyeDetection |
+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+--------------+
| 4329.23  | 4009.23  | 4289.23  | 4148.21  | 4350.26  | 4586.15  | 4096.92  | 4641.03  | 4222.05  | 4238.46  | 4211.28  | 4280.51  | 4635.9   | 4393.85  | 0            |
| 4324.62  | 4004.62  | 4293.85  | 4148.72  | 4342.05  | 4586.67  | 4097.44  | 4638.97  | 4210.77  | 4226.67  | 4207.69  | 4279.49  | 4632.82  | 4384.1   | 0            |
| 4327.69  | 4006.67  | 4295.38  | 4156.41  | 4336.92  | 4583.59  | 4096.92  | 4630.26  | 4207.69  | 4222.05  | 4206.67  | 4282.05  | 4628.72  | 4389.23  | 0            |
| 4328.72  | 4011.79  | 4296.41  | 4155.9   | 4343.59  | 4582.56  | 4097.44  | 4630.77  | 4217.44  | 4235.38  | 4210.77  | 4287.69  | 4632.31  | 4396.41  | 0            |
| 4326.15  | 4011.79  | 4292.31  | 4151.28  | 4347.69  | 4586.67  | 4095.9   | 4627.69  | 4210.77  | 4244.1   | 4212.82  | 4288.21  | 4632.82  | 4398.46  | 0            |
+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+--------------+
```

Where:

| Column                                                                    | Description                                                                               | Data Type | Usage   |
| ------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | --------- | ------- |
| `AF3`,`F7`,`F3`,`FC5`,`T7`,`P7`,`O1`,`O2`,`P8`,`T8`,`FC6`,`F4`,`F8`,`AF4` | The EEG measurement data.                                                                 | `float`   | Feature |
| `eyeDetectin`                                                             | The state of the patient's eye where `0` indicates open eye and `1` indicates closed eye. | `binary`  | Label   |
<Info>
**Labels and Features**
A **label** is a column whose values will be predicted (the y variable in simple
linear regression).
A **feature** is a column used to train the model (the x variable in simple
linear regression).
</Info>
## Training a Predictor
Let's create and train the machine learning model. For that, we use the
[`CREATE MODEL`](/sql/create/model) statement and specify the
input columns used to train `FROM` (features) and what we want to
`PREDICT` (labels).

The `eyeDetection` column is our target variable. The interesting thing about
this example is that we aim to forecast _labels_ that are not strictly
numerical. Even though this example is simple (because the variable is a binary
category), this can easily be generalized to more than two categories.

We order the measurements by the `Timestamps` column that shows readings
frequency of approximately 8 milliseconds.

```sql
CREATE MODEL mindsdb.eeg_eye_forecast
FROM example_db
  (SELECT * FROM demo_data.eeg_eye)
PREDICT eyedetection
ORDER BY timestamps
WINDOW 50
HORIZON 10;
```

As the sampling frequency is 8 ms, this predictor is trained using a historical
context of roughly 400 ms (`(50 * 8) = 400 [ms]`) to predict the following 80 ms
(`(10 * 8) = 80 [ms]`).

## Status of a Predictor

A predictor may take a couple of minutes for the training to complete. You can
monitor the status of the predictor by using this SQL command:

```sql
DESCRIBE eeg_eye_forecast;
```

If we run it right after creating a predictor, we get this output:

```sql
+------------+
| status     |
+------------+
| generating |
+------------+
```

A bit later, this is the output:

```sql
+----------+
| status   |
+----------+
| training |
+----------+
```

And at last, this should be the output:

```sql
+----------+
| status   |
+----------+
| complete |
+----------+
```

Now, if the status of our predictor says `complete`, we can start making
predictions!

## Making Predictions

You can make predictions by querying the predictor joined with the data table.
The [`SELECT`](/sql/api/select/) statement lets you make predictions for the
label based on the chosen features for a given time period. Usually, you want to
know what happens right after the latest training data point that was fed. We
have a special keyword for that, the `LATEST` keyword.

Let's run a query to get predictions for the next `HORIZON` timesteps into
the future, which in this case is roughly 80 milliseconds.
```sql
SELECT m.timestamps, m.eyedetection
FROM example_db.demo_data.eeg_eye as t
JOIN mindsdb.eeg_eye_forecast as m
WHERE t.timestamps > LATEST
LIMIT 10;
```
On execution, we get:
```sql
+----------------------------+--------------+
| timestamps                 | eyedetection |
+----------------------------+--------------+
| 2001-09-03 08:01:57.000000 | 1            |
| 2001-09-03 08:01:57.008000 | 1            |
| 2001-09-03 08:01:57.016000 | 1            |
| 2001-09-03 08:01:57.024000 | 1            |
| 2001-09-03 08:01:57.032000 | 1            |
| 2001-09-03 08:01:57.040000 | 1            |
| 2001-09-03 08:01:57.048000 | 1            |
| 2001-09-03 08:01:57.056000 | 1            |
| 2001-09-03 08:01:57.064000 | 1            |
| 2001-09-03 08:01:57.072000 | 1            |
+----------------------------+--------------+
```
That's it. We can now `JOIN` any set of `WINDOW` rows worth of
measurements with this predictor, and forecasts will be emitted to help us
expect a change in the state of the patient's eye based on the EEG readings.
## Alternate Problem Framings
It is also possible to reframe this task as a _normal_ forecasting scenario
where the variable is numeric. There are a few options here. It boils down to
what the broader scenario is and what format would maximize the value of any
specific prediction.
For example, a simple mapping of `eye is open` to `0` and `eye is closed` to `1`
would be enough to replicate the above behavior.
We could also explore other options. With some data transformations on the data
layer, we could get a countdown to the next change in state, effectively
predicting a _date_ if we cast this back into the timestamp domain.
## What's Next?

Have fun while trying it out yourself!

- Bookmark [MindsDB repository on GitHub](https://github.com/mindsdb/mindsdb).
- Sign up for a free [MindsDB account](https://cloud.mindsdb.com/register).
- Engage with the MindsDB community on
  [Slack](https://mindsdb.com/joincommunity) or
  [GitHub](https://github.com/mindsdb/mindsdb/discussions) to ask questions and
  share your ideas and thoughts.

If this tutorial was helpful, please give us a GitHub star
[here](https://github.com/mindsdb/mindsdb).
